1 State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, Henan, China
2 Xinjiang Key Laboratory of Crop Gene Editing and Germplasm Innovation, Institute of Western Agricultural of CAAS, Changji, 831100, Xinjiang, China
3 Engineering Research Centre of Cotton, Ministry of Education/College of Agriculture, Xinjiang Agricultural University, 311 Nongda East Road, Urumqi, 830052, China
4 College of Smart Agriculture (Research Institute), Xinjiang University, Urumqi, 830046, Xinjiang, China
5 The Key Laboratory of Oasis Eco-Agriculture, The Xinjiang Production and Construction Corps/College of Agronomy, Shihezi University, Shihezi, 832003, Xinjiang, China
6 Xinjiang Key Laboratory of Special Species Conservation and Regulatory Biology, College of Life Science, Xinjiang Normal University, Urumqi, 830017, China
7 These authors have contributed equally to this work.
Received 25 Apr 2025
Accepted 17 Sep 2025
Published 22 Sep 2025
Pre-harvest defoliation of cotton is a key agronomic practice for improving mechanical harvesting efficiency and raw cotton purity. Collecting defoliation trait data for genetic mapping, and thereby breeding varieties that defoliate readily, is an important alternative to conventional defoliant spraying. However, manual field surveys are low-throughput and prone to human error. In this study, we established a framework for collecting high-throughput defoliation data in large fields. Three spectral indices (MTCI, VDVI, CI) and the leaf area index (LAI) were first screened as core predictors through hierarchical segmentation analysis at three levels: leaf number (LN), leaf number difference (LND), and defoliation rate (DR). Four deep learning architectures (CNN, BiGRU, CNN-BiGRU, and CNN-BiGRU-Attention) were developed; the CNN-BiGRU-Attention hybrid model performed best at all three levels, with R² values exceeding 0.85. Importantly, the inversion accuracy of this model was higher at the LN and LND levels than at the DR level, a result that was also supported by the genome-wide association study (GWAS). By combining GWAS and transcriptome results, we identified a new gene, GhDR_UAV1, associated with defoliation traits. Overexpression of GhDR_UAV1 significantly promoted the wilting of cotton leaves, indicating that GhDR_UAV1 positively regulates cotton defoliation. This study proposes a strategy for inverting cotton defoliation data at three levels by using deep learning to fuse UAV remote sensing data with LAI data, and confirms that LND can provide accurate phenotypic data for GWAS. By integrating high-throughput defoliation phenomics with genomics, it provides a new theoretical basis for the regulation and genetic improvement of cotton defoliation.
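The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The abstract does not define how LND and DR are derived from LN, so the formulas below (difference and percentage loss relative to the pre-treatment leaf count) are assumptions. VDVI and MTCI follow their commonly used band-ratio definitions, with generic red-edge and near-infrared bands standing in for the sensor-specific wavelengths. CI is omitted because the abstract does not say which index it denotes. All function and parameter names are hypothetical.

```python
from typing import Sequence


def leaf_number_difference(ln_before: float, ln_after: float) -> float:
    """LND, assumed here to be leaves lost between two surveys."""
    return ln_before - ln_after


def defoliation_rate(ln_before: float, ln_after: float) -> float:
    """DR in percent, assumed here to be LND relative to the initial LN."""
    if ln_before <= 0:
        raise ValueError("ln_before must be positive")
    return 100.0 * leaf_number_difference(ln_before, ln_after) / ln_before


def vdvi(red: float, green: float, blue: float) -> float:
    """Visible-band difference vegetation index from RGB reflectance."""
    denom = 2.0 * green + red + blue
    if denom == 0:
        raise ValueError("band sum is zero")
    return (2.0 * green - red - blue) / denom


def mtci(nir: float, red_edge: float, red: float) -> float:
    """MERIS terrestrial chlorophyll index with generic band inputs."""
    denom = red_edge - red
    if denom == 0:
        raise ValueError("red_edge equals red")
    return (nir - red_edge) / denom


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination between observed and predicted values."""
    if len(observed) == 0 or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot
```

Under these assumed definitions, a plot that goes from 40 leaves to 10 leaves has an LND of 30 and a DR of 75%.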